Chaining algorithms for multiple genome comparison
Abstract
Given n fragments from k > 2 genomes, Myers and Miller showed how to find an optimal global chain of colinear non-overlapping fragments in O(n log^k n) time and O(n log^(k-1) n) space. For gap costs in the L1-metric, we reduce the time complexity of their algorithm by a factor of log^2 n / log log n and the space complexity by a factor of log n. For the sum-of-pairs gap cost, our algorithm improves the time complexity of their algorithm by a factor of log n / log log n. A variant of our algorithm finds all significant local chains of colinear non-overlapping fragments. These chaining algorithms can be used in a variety of problems in comparative genomics: the computation of global alignments of complete genomes, the identification of regions of similarity (candidate regions of conserved synteny), the detection of genome rearrangements, and exon prediction.
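To make the problem statement concrete, the following is a minimal sketch of global chaining as a plain quadratic-time dynamic program. It is only an illustrative baseline: it is neither the Myers and Miller algorithm nor the faster algorithm the abstract describes, and it ignores gap costs entirely. The fragment representation and the names `Fragment`, `precedes`, and `global_chain` are assumptions made for this example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical representation: (begin, end, weight), where begin and end
# are k-tuples holding the fragment's position in each of the k genomes.
Fragment = Tuple[Tuple[int, ...], Tuple[int, ...], float]


def precedes(f: Fragment, g: Fragment) -> bool:
    """True if f ends strictly before g begins in every genome,
    i.e. f and g are colinear and do not overlap."""
    return all(f_end < g_begin for f_end, g_begin in zip(f[1], g[0]))


def global_chain(fragments: Sequence[Fragment]) -> Tuple[float, List[int]]:
    """Return (score, fragment indices) of a maximum-weight chain.

    Quadratic-time baseline without gap costs (an assumption of this sketch).
    """
    if not fragments:
        return 0.0, []
    # Sort by end position in the first genome: any fragment that can
    # precede another one then appears earlier in `order`.
    order = sorted(range(len(fragments)), key=lambda i: fragments[i][1][0])
    best: Dict[int, float] = {}          # best chain score ending at fragment i
    prev: Dict[int, Optional[int]] = {}  # predecessor of i in that chain
    for pos, i in enumerate(order):
        best[i] = fragments[i][2]
        prev[i] = None
        for j in order[:pos]:
            candidate = best[j] + fragments[i][2]
            if precedes(fragments[j], fragments[i]) and candidate > best[i]:
                best[i] = candidate
                prev[i] = j
    # Trace back from the best-scoring chain end.
    cur: Optional[int] = max(best, key=lambda i: best[i])
    score = best[cur]
    chain: List[int] = []
    while cur is not None:
        chain.append(cur)
        cur = prev[cur]
    chain.reverse()
    return score, chain


# Four made-up fragments from k = 3 genomes.
example = [
    ((1, 1, 1), (3, 3, 3), 3.0),
    ((2, 5, 2), (4, 7, 4), 4.0),   # overlaps fragment 0 in genomes 1 and 3
    ((5, 4, 5), (8, 6, 7), 2.0),
    ((9, 8, 8), (10, 10, 10), 1.0),
]
print(global_chain(example))  # (6.0, [0, 2, 3])
```

The inner loop inspects every earlier fragment, which gives O(n^2) precedence checks. The algorithms referred to in the abstract avoid this by answering the "best predecessor" query with multidimensional range searching, which is where the polylogarithmic factors come from.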
Similar resources
Chaining Algorithms for Alignment of Draft Sequence
In this paper we propose a chaining method that can align a draft genomic sequence against a finished genome. We introduce the use of an overlap tree to enhance the state information available to the chaining procedure in the context of sparse dynamic programming, and demonstrate that the resulting procedure more accurately penalizes the various biological rearrangements. The algorithm is teste...
Efficient mapping of large cDNA/EST databases to genomes: A comparison of two different strategies
This paper presents a comparison of two strategies for cDNA/EST mapping: The seed-and-extend strategy and the fragment-chaining strategy. We derive theoretical results on the statistics of fragments of type maximal exact match. Moreover, we present efficient fragment-chaining algorithms that are simpler than previous ones. In experiments, we compared our implementation of the fragment-chaining ...
Chaining Algorithms and Applications in Comparative Genomics
University of Ulm, Germany. 1.1 Motivation: Comparison of Whole Genomes; 1.2 Basic Definitions and Concepts; 1.3 A Global Chaining Algorithm without Gap Costs (The Basic Chaining Algorithm • Applications); 1.4 Incorporating Gap Costs into the Algorithm (Costs in the L1 Metric • Costs in the L∞ Metric); 1.5 Variations ...
Graph-Theoretic Modelling of the Domain Chaining Problem
Methods for the clustering of genes into homologous families (sets of genes descending from a single gene in an ancestral organism) are susceptible to the inappropriate merging of unrelated families, called domain chaining. We give formal criteria for the chaining effect by defining multiple alternative clique relaxation and path relaxation models and the relationships among them, involving diffe...
A comprehensive benchmark between two filter-based multiple-point simulation algorithms
Computer graphics offer various gadgets to enhance the reconstruction of high-order statistics that are not correctly addressed by the two-point statistics approaches. Almost all the newly developed multiple-point geostatistics (MPS) algorithms, to some extent, adapt these techniques to increase the simulation accuracy and efficiency. In this work, a scrutiny comparison between our recently dev...
Journal title:
- J. Discrete Algorithms
Volume 3
Publication date: 2005